
Add ability to enable Kubernetes feature gates and runtimeconfig in KubeVirt CI #1345

Open · wants to merge 2 commits into base: main

Conversation

@Barakmor1 (Member) commented Jan 14, 2025

What this PR does / why we need it:
There is a need to test new capabilities, such as the ImageVolume POC and DRA.

This change allows enabling/disabling Kubernetes feature gates and runtime config in the KubeVirt CI. Users can now set the K8S_FEATURE_GATES environment variable before running make cluster-up to enable specific feature gates. For example:

export K8S_FEATURE_GATES="<FeatureGate1>=true,<FeatureGate2>=false"

Users can also set runtime configuration by setting
the K8S_API_RUNTIME_CONFIG environment variable before
running make cluster-up.

For example:

export K8S_API_RUNTIME_CONFIG="RuntimeConfig=true"

This modifies the kube-controller-manager, kube-apiserver, and kubelet configuration files, restarts the components, and waits for them to be ready with the feature gates applied.
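Putting it together, a typical flow looks like this (the gate and runtime-config values below are illustrative placeholders, not anything this change enables by default):

export K8S_FEATURE_GATES="ImageVolume=true"
export K8S_API_RUNTIME_CONFIG="resource.k8s.io/v1alpha3=true"
make cluster-up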

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Checklist

This checklist is not enforced, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Release note:

Add ability to enable Kubernetes feature gates and runtime config in KubeVirt CI

@kubevirt-bot added the label dco-signoff: yes (Indicates the PR's author has DCO signed all their commits.) on Jan 14, 2025
@kubevirt-bot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign rmohr for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dosubot added the label sig/ci (Denotes an issue or PR as being related to sig-ci, marks changes to the CI system.) on Jan 14, 2025
@iholder101 (Contributor) left a comment


Thank you @Barakmor1! Looks awesome!
I think this PR is really important, as it will help us test alpha/beta features well in advance, which is a huge advantage in terms of development.

This is a relatively shallow review as I'm not an expert in this area.

P.S. If in the future we're also able to support #1264, combining these two capabilities would be absolutely great!

(Four review threads on cluster-provision/gocli/opts/featuregate/featuregate.go — outdated, resolved.)
@jean-edouard (Contributor)

/cc @alicefr @victortoso @jean-edouard @brianmcarey

This PR makes sense to me. It's just surprising and unfortunate that we need that much code just to add a couple feature gates to k8s.
Let's just try and keep the code as simple as possible.
Please ping me once every review comment has been addressed and I'll approve.

@iholder101 (Contributor)

> /cc @alicefr @victortoso @jean-edouard @brianmcarey
>
> This PR makes sense to me. It's just surprising and unfortunate that we need that much code just to add a couple feature gates to k8s. Let's just try and keep the code as simple as possible. Please ping me once every review comment has been addressed and I'll approve.

@Barakmor1 Maybe there's a way to reduce code and make things a bit cleaner.
@victortoso used the concept of config file drop-ins [1] as part of #1299 which can also be used here. Briefly, it allows having multiple config files that would be merged later by kubelet. So instead of changing the config, you can add a new config (e.g. 05-k8s-feature-gates.conf) with the following contents:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  NodeSwap: true

WDYT?

[1] https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/#kubelet-conf-d
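For illustration, applying such a drop-in on a node could look like this (a sketch; it assumes kubelet is started with the beta --config-dir flag pointing at /etc/kubernetes/kubelet.conf.d, and both the directory and file name are hypothetical):

sudo mkdir -p /etc/kubernetes/kubelet.conf.d
sudo tee /etc/kubernetes/kubelet.conf.d/05-k8s-feature-gates.conf <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  NodeSwap: true
EOF
sudo systemctl restart kubelet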

this allows enabling/disabling feature gates by setting
the `K8S_FEATURE_GATES` environment variable before
running `make cluster-up`.

For example:

export K8S_FEATURE_GATES="FeatureGate1=true,FeatureGate2=true"

You can also set runtime configuration by setting
the `K8S_API_RUNTIME_CONFIG` environment variable before
running `make cluster-up`.

For example:

export K8S_API_RUNTIME_CONFIG="RuntimeConfig=true"

This will update the configuration of the kube-controller, kube-apiserver,
and kubelet to include the flags. These components will restart,
and this logic will wait until they are ready again.

No feature gates or runtime configs are enabled by default.

Signed-off-by: bmordeha <[email protected]>
@Barakmor1 Barakmor1 changed the title Add ability to enable Kubernetes feature gates in KubeVirt CI Add ability to enable Kubernetes feature gates and runtimeconfig in KubeVirt CI Jan 16, 2025
@Barakmor1 (Member, Author)

@iholder101 @victortoso Please have another look

@Barakmor1 (Member, Author) commented Jan 16, 2025

> /cc @alicefr @victortoso @jean-edouard @brianmcarey
>
> This PR makes sense to me. It's just surprising and unfortunate that we need that much code just to add a couple feature gates to k8s. Let's just try and keep the code as simple as possible. Please ping me once every review comment has been addressed and I'll approve.

> @Barakmor1 Maybe there's a way to reduce code and make things a bit cleaner. @victortoso used the concept of config file drop-ins [1] as part of #1299 which can also be used here. Briefly, it allows having multiple config files that would be merged later by kubelet. So instead of changing the config, you can add a new config (e.g. 05-k8s-feature-gates.conf) with the following contents:
>
> apiVersion: kubelet.config.k8s.io/v1beta1
> kind: KubeletConfiguration
> featureGates:
>   NodeSwap: true
>
> WDYT?
>
> [1] https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/#kubelet-conf-d

Hey, unfortunately, I couldn’t find an easy way to update Kubernetes components with feature gates and flags after the cluster is already deployed.

Most of the work in the PR is about changing the configurations of the kube-apiserver, kube-controller, kube-scheduler, and kubelet, and then making sure these changes are applied correctly. The final step is to ensure the components are ready again before moving on with other KubeVirt CI tasks.

Using the drop-in would only change how the kubelet configuration is updated, but we’d still need to restart kubelet and verify that kubelet is ready. So, I don’t think this would simplify the code.
Please let me know if I missed something or if you have another idea, and I'd be happy to try it out! :)

@victortoso (Member) left a comment


> Using the drop-in would only change how the kubelet configuration is updated, but we'd still need to restart kubelet and verify that kubelet is ready. So, I don't think this would simplify the code.

Right, so the kubevirtci image is already created when we do something like K8S_FEATURE_GATES=DRA make cluster-up and we would need to restart kubelet after copying the config file, on each worker node.

Still, we do have some logic that does something like that, see configure_cpu_manager introduced in 2981e7e

I'm not sure what the best way forward is. K8s feature-gate requirements can change between k8s versions, and if we want to help developers try those with kubevirtci, we will probably have to handle custom k8s configs and manage to update the cluster's components with them.

Drop-ins would be great, but AFAICT not all k8s components have them, and even the kubelet support is beta. What about using kubeadm config?
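For context, provision-time configuration via a kubeadm config might look roughly like this (a sketch assuming the v1beta3 kubeadm API; the gate and runtime-config values are placeholders):

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    feature-gates: "SomeGate=true"
    runtime-config: "somegroup.k8s.io/v1alpha1=true"
controllerManager:
  extraArgs:
    feature-gates: "SomeGate=true"
scheduler:
  extraArgs:
    feature-gates: "SomeGate=true"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  SomeGate: true

Note this only takes effect at kubeadm init time, so on its own it would not reconfigure the already-provisioned cluster this PR targets.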

(Review thread on KUBEVIRTCI_LOCAL_TESTING.md — resolved.)
@Barakmor1 (Member, Author)

> > Using the drop-in would only change how the kubelet configuration is updated, but we'd still need to restart kubelet and verify that kubelet is ready. So, I don't think this would simplify the code.
>
> Right, so the kubevirtci image is already created when we do something like K8S_FEATURE_GATES=DRA make cluster-up and we would need to restart kubelet after copying the config file, on each worker node.
>
> Still, we do have some logic that does something like that, see configure_cpu_manager introduced in 2981e7e

Please take a look at the prepareCommandsForConfiguration function in cluster-provision/gocli/opts/k8scomponents/kubelet.go. This is where the kubelet restart is handled in this PR.

> I'm not sure what the best way forward is. K8s feature-gate requirements can change between k8s versions, and if we want to help developers try those with kubevirtci, we will probably have to handle custom k8s configs and manage to update the cluster's components with them.
>
> Drop-ins would be great, but AFAICT not all k8s components have them, and even the kubelet support is beta. What about using kubeadm config?

Well, the way I see it, this PR adds the option to include useful flags for the Kubernetes components: API server, kube-controller-manager, kube-scheduler, and kubelet. If we want to change how these flags are added, or add new flags for any of the components in the future, we can do so by updating the prepareCommandsForConfiguration function in the respective component's file:

For kubelet: cluster-provision/gocli/opts/k8scomponents/kubelet.go

For the API server: cluster-provision/gocli/opts/k8scomponents/kube-api.go

For kube-controller-manager: cluster-provision/gocli/opts/k8scomponents/kubecontrollermanager.go

For kube-scheduler: cluster-provision/gocli/opts/k8scomponents/kube-scheduler.go

I'm not certain if this is the optimal way to add the flags, but I aimed to make it easy to modify or extend in the future.

@iholder101 (Contributor) left a comment


Thanks @Barakmor1, looks great!
Can this PR be tested with a feature gate somehow to ensure it's working before we merge it? Perhaps create a cloned draft PR that enables an FG?

Comment on lines +34 to +35
prepareCommandsForConfiguration() error
runCommandsToConfigure() error

I think we can squash these into one function. I suggest calling it configureComponent
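For the record, a sketch of the squashed interface (hypothetical; verifyComponent and waitForComponentReady are taken from the existing code):

type component interface {
	verifyComponent() error
	configureComponent() error // prepares the configuration commands and runs them in one step
	waitForComponentReady() error
}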

Comment on lines +24 to +28
addFeatureGatesFieldToKubeletConfigCommand = "sudo echo -e 'featureGates:' >> /var/lib/kubelet/config.yaml"
addFeatureGatesToKubeletConfigCommandFormatFormat = `sudo sed -i 's/featureGates:/featureGates:\n %s/g' /var/lib/kubelet/config.yaml`
kubeletRestartCommand = "sudo systemctl restart kubelet"

featureGateExistInKubeletError = "feature gates should not exist in kubelet by default"

Can we move these variables to kubelet.go?
Similarly, componentKubeAPIServer and everything component-specific?

searchFeatureGatesInFileFormat = "awk '/feature-gates/' %s"
getComponentCommandFormat = "kubectl --kubeconfig=/etc/kubernetes/admin.conf get pods -n kube-system -l component=%s -o jsonpath='{.items[0].spec.containers[*].command}'"
getComponentReadyContainersFormat = "kubectl --kubeconfig=/etc/kubernetes/admin.conf get pods -n kube-system -l component=%s -o=jsonpath='{.items[0].status.containerStatuses[*].ready}'"
getNodeReadyStatusCommand = "kubectl --kubeconfig=/etc/kubernetes/admin.confget nodes -o=jsonpath='{.items[*].status.conditions[?(@.type==\"Ready\")].status}'"

--kubeconfig=/etc/kubernetes/admin.confget -> --kubeconfig=/etc/kubernetes/admin.conf get

componentKubeScheduler componentName = "kube-scheduler"

searchFeatureGatesInFileFormat = "awk '/feature-gates/' %s"
getComponentCommandFormat = "kubectl --kubeconfig=/etc/kubernetes/admin.conf get pods -n kube-system -l component=%s -o jsonpath='{.items[0].spec.containers[*].command}'"

--kubeconfig=/etc/kubernetes/admin.conf

Isn't this the default value for this flag?

Comment on lines +66 to +68
if waitingForFeatureGate && !strings.Contains(output, "feature-gate") {
return false, fmt.Sprintf("modifying flags, waiting for %v pods to restart\n", component)
}

When changing feature gates via a config, would they show up as a CLI argument?

)

type component interface {
verifyComponent() error

nit

Suggested change:
- verifyComponent() error
+ validateComponent() error

Comment on lines +108 to +126
func searchFeatureGatesInFile(file string) string {
return fmt.Sprintf(searchFeatureGatesInFileFormat, file)
}

func getComponentCommand(component componentName) string {
return fmt.Sprintf(getComponentCommandFormat, component)
}

func getComponentReadyContainers(component componentName) string {
return fmt.Sprintf(getComponentReadyContainersFormat, component)
}

func addFlagsToComponentCommand(component componentName, flag string) string {
return fmt.Sprintf(addFlagsToComponentCommandFormat, component, flag, component)
}

func addFeatureGatesToKubeletConfigCommand(feature string) string {
return fmt.Sprintf(addFeatureGatesToKubeletConfigCommandFormatFormat, feature)
}

I'd remove the consts and inline them here, since this is their only usage.

RunSpecs(t, "FeatureGatesOpt Suite")
}

type cmdOutPut struct {

nit

Suggested change:
- type cmdOutPut struct {
+ type cmdOutput struct {

Comment on lines +37 to +47
if k.featureGates != "" {
k.cmds = append(k.cmds, addFeatureGatesFieldToKubeletConfigCommand)
keyValuePairs := strings.Split(k.featureGates, ",")
var formattedFeatureGates []string
for _, pair := range keyValuePairs {
formattedFeatureGates = append(formattedFeatureGates, strings.Replace(pair, "=", ": ", 1))
}
for _, fg := range formattedFeatureGates {
k.cmds = append(k.cmds, addFeatureGatesToKubeletConfigCommand(fg))
}
k.cmds = append(k.cmds, kubeletRestartCommand)

I think that using drop-ins just for the kubelet would make the code much easier to understand, as it's pretty elegant and more correct. Since you're managing the different components with interface instances, and since the kubelet is implemented differently from the other components either way, I think it would be the right way to go.
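For what it's worth, a minimal sketch of the drop-in variant (function name and paths are hypothetical; it assumes kubelet picks up /etc/kubernetes/kubelet.conf.d via --config-dir):

import (
	"fmt"
	"strings"
)

// Build the commands for a kubelet drop-in instead of sed-editing
// /var/lib/kubelet/config.yaml in place.
func kubeletDropInCommands(featureGates string) []string {
	var b strings.Builder
	b.WriteString("apiVersion: kubelet.config.k8s.io/v1beta1\nkind: KubeletConfiguration\nfeatureGates:\n")
	for _, pair := range strings.Split(featureGates, ",") {
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 {
			continue // skip malformed entries, e.g. a gate without "=true"
		}
		b.WriteString(fmt.Sprintf("  %s: %s\n", kv[0], kv[1]))
	}
	return []string{
		"sudo mkdir -p /etc/kubernetes/kubelet.conf.d",
		fmt.Sprintf("sudo tee /etc/kubernetes/kubelet.conf.d/05-k8s-feature-gates.conf <<'EOF'\n%sEOF", b.String()),
		"sudo systemctl restart kubelet",
	}
}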

@Barakmor1 (Member, Author) commented Jan 20, 2025

I agree, although I don't think this should block the PR, considering other places (swap and cpu-manager) where we are already modifying kubelet without the drop-in method. However, I see there is consensus on modifying kubelet with the drop-in beta feature, which is not yet supported until we add the --config flag to the kubeadm command during node provisioning, which will be done in #1299 of @victortoso.
I'll wait until it's merged and then use the drop-in method.

A Contributor replied:

> which is not yet supported until we add the --config flag to the kubeadm command during node provisioning, which will be done in #1299 of @victortoso.
> I'll wait until it's merged and then use the drop-in method.

Not sure there's a need to wait, you can just add the --config-dir in this PR.

> I agree, although I don't think this should block the PR

I agree. Although I'm for it, if you feel strongly about it, I don't think it should block the PR.
Worst case we can do that as a follow-up.

Comment on lines +57 to +74
func (k *kubeletComponent) waitForComponentReady() error {
ticker := time.NewTicker(1 * time.Second)
defer ticker.Stop()
timeoutChannel := time.After(3 * time.Minute)
select {
case <-timeoutChannel:
return fmt.Errorf("timed out after 3 minutes waiting for node to be ready")
case <-ticker.C:
output, err := k.sshClient.CommandWithNoStdOut(getNodeReadyStatusCommand)
if err == nil && !strings.Contains(output, "false") {
return nil
}
if err != nil {
fmt.Printf("Modifying kubelet configuration, API server not responding yet, err: %v\n", err)
}
}
return nil
}

Since this function is duplicated, you can extract it and have it take a function parameter that does the actual validation. That way you'll be able to use it for all components.
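A minimal sketch of such an extraction (hypothetical names; note that, unlike the snippet above, it keeps polling until the timeout instead of returning after a single tick):

import (
	"fmt"
	"time"
)

// Poll until the supplied check reports ready, printing any interim status.
func waitForReady(description string, poll func() (ready bool, status string)) error {
	ticker := time.NewTicker(1 * time.Second)
	defer ticker.Stop()
	timeout := time.After(3 * time.Minute)
	for {
		select {
		case <-timeout:
			return fmt.Errorf("timed out after 3 minutes waiting for %s to be ready", description)
		case <-ticker.C:
			ready, status := poll()
			if ready {
				return nil
			}
			if status != "" {
				fmt.Println(status)
			}
		}
	}
}

Each component would then pass its own closure, e.g. one that runs getNodeReadyStatusCommand over SSH and checks the output.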

@Barakmor1 (Member, Author) commented Jan 20, 2025

> Can this PR be tested with a feature gate somehow to ensure it's working before we merge it? Perhaps create a cloned draft PR that enables an FG?

This would only take effect if K8S_FEATURE_GATES or K8S_API_RUNTIME_CONFIG is set. I already tested it manually and it worked for me; I'm not sure if we can test it through KCI lanes.

@Barakmor1 Barakmor1 closed this Jan 20, 2025
@Barakmor1 (Member, Author)

/reopen
closed by mistake

@kubevirt-bot kubevirt-bot reopened this Jan 20, 2025
@kubevirt-bot (Contributor)

@Barakmor1: Reopened this PR.

In response to this:

> /reopen
> closed by mistake

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kubevirt-bot (Contributor)

@Barakmor1: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
check-provision-k8s-1.31-s390x | 8f915b2 | link | false | /test check-provision-k8s-1.31-s390x
check-up-kind-sriov | 8f915b2 | link | false | /test check-up-kind-sriov
check-provision-k8s-1.32 | 8f915b2 | link | true | /test check-provision-k8s-1.32

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
dco-signoff: yes — Indicates the PR's author has DCO signed all their commits.
sig/ci — Denotes an issue or PR as being related to sig-ci, marks changes to the CI system.
size/XL
5 participants